# SoC design lab3-fir

### **Block Diagram**

### • Datapath-dataflow





### • Control signals

### **Describe operation**

### • How to receive data-in and tap parameters and place them into SRAM



The FIR writes each incoming data word directly into SRAM as it is read in.

### • How to access shiftRAM and tapRAM to do computation



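To improve performance, the data in SRAM is not shifted. Instead, the position at which each pass starts reading the data RAM is moved. In addition, the address of the required data is given to the SRAM one cycle before the computation needs it. In the next cycle, the FIR computes on the data returned by the SRAM and, at the same time, gives the SRAM the address of the data needed in the following cycle, so the resources are fully utilized.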
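The two ideas above (moving the read start position instead of shifting, and issuing each address one cycle before its data is used) can be sketched behaviorally in Python. This is only an illustration under assumptions: the class `CircularDataRam`, the pointer `head`, and the function `fir_output_pipelined` are hypothetical names, and the actual RTL may organize its pointers differently.

```python
TAPS = 11  # tap count stated in the report


class CircularDataRam:
    """Behavioral stand-in for the data RAM (all names here are hypothetical)."""

    def __init__(self, depth=TAPS):
        self.mem = [0] * depth   # zero-filled so early outputs see zeros
        self.head = 0            # slot that holds the newest sample

    def push(self, x):
        """Overwrite the oldest slot instead of shifting every entry."""
        self.head = (self.head + 1) % len(self.mem)
        self.mem[self.head] = x

    def read_back(self, k):
        """Return the sample that arrived k inputs ago (k = 0 is the newest)."""
        return self.mem[(self.head - k) % len(self.mem)]


def fir_output_pipelined(ram, taps):
    """Issue one address per cycle and consume its data one cycle later."""
    acc = 0
    pending = None                      # index whose address was issued last cycle
    for cycle in range(len(taps) + 1):
        if pending is not None:         # data for last cycle's address arrives now
            acc += taps[pending] * ram.read_back(pending)
        pending = cycle if cycle < len(taps) else None  # issue the next address
    return acc
```

Only the pointer moves on each new sample, so a write costs one RAM access regardless of the filter length, and the multiply of one tap overlaps with the address issue of the next.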

### • How each output y is computed.

Computing one y takes 11 clock cycles. tap\_idx steps through 9, 8, ..., 1, 0 and finally 10.

9: compute tap10. Start computing a new y; acc reset=1, so the accumulation restarts.

8~1: compute tap9~2.

0: compute tap1. Read one input x and write it to the data RAM.

10: compute tap0. Output y; sm\_tvalid=1.

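For the first 11 outputs y, the values in the data RAM have not been initialized, so the comparison tap\_idx > data\_idx decides whether to read the value from the data RAM or to compute with 0 directly.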
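The 11-cycle schedule and the zero substitution for unwritten RAM slots can be modeled as follows. This is a behavioral sketch, not the RTL: `wr` and `n_received` are hypothetical bookkeeping variables, and the check `k <= n_received` is only an analogue of the report's tap\_idx > data\_idx comparison, whose exact definition is not given here.

```python
TAPS = 11
TAP_IDX_ORDER = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0, 10]  # one output = 11 cycles


def tap_computed(tap_idx):
    """Tap multiplied while tap_idx holds this value, per the schedule above."""
    return 0 if tap_idx == 10 else tap_idx + 1


def fir_stream(xs, taps):
    """Return (outputs, cycle_count) for the input samples xs."""
    ram = [None] * TAPS   # None models data RAM that was never initialised
    wr = 0                # hypothetical write pointer: next slot to overwrite
    n_received = 0        # hypothetical count of samples written so far
    ys, cycles, acc = [], 0, 0
    for x in xs:
        for tap_idx in TAP_IDX_ORDER:
            cycles += 1
            if tap_idx == 9:
                acc = 0                          # acc reset: start a new y
            k = tap_computed(tap_idx)
            if k == 0:
                sample = ram[(wr - 1) % TAPS]    # the x written one cycle earlier
            elif k <= n_received:
                sample = ram[(wr - k) % TAPS]    # x[n-k], already stored
            else:
                sample = 0                       # slot not written yet: use 0
            acc += taps[k] * sample
            if tap_idx == 0:                     # read one x, write it to data RAM
                ram[wr] = x
                wr = (wr + 1) % TAPS
                n_received += 1
            if tap_idx == 10:
                ys.append(acc)                   # output y (sm_tvalid = 1)
    return ys, cycles
```

Because the new sample is written while tap1 is being computed, it overwrites the slot holding the oldest sample, which the current output no longer needs.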

### • How ap\_done is generated.

ap\_done is set to 1 after the last output has been received by fir\_tb.
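One way to express this condition is to count accepted output transfers and raise ap\_done when the count reaches the programmed data length. The sketch below assumes a valid/ready handshake on the output stream; the class `DoneTracker`, the counter `sent`, and the use of `sm_tready` are assumptions for illustration and are not taken from the design.

```python
class DoneTracker:
    """Raise ap_done once the last output transfer has been accepted."""

    def __init__(self, data_length):
        self.data_length = data_length  # number of outputs to produce
        self.sent = 0                   # hypothetical count of accepted outputs
        self.ap_done = 0

    def clock(self, sm_tvalid, sm_tready):
        """Advance one clock cycle and return the current ap_done value."""
        if sm_tvalid and sm_tready and not self.ap_done:
            self.sent += 1              # one output was received this cycle
            if self.sent == self.data_length:
                self.ap_done = 1        # last output has been received
        return self.ap_done
```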

#### Resource usage: including FF, LUT, BRAM

| Site Type             | Used | Fixed | Prohibited | Available | Util% |
|-----------------------|------|-------|------------|-----------|-------|
| Slice LUTs*           | 303  | 0     | 0          | 53200     | 0.57  |
| LUT as Logic          | 303  | 0     | 0          | 53200     | 0.57  |
| LUT as Memory         | 0    | 0     | 0          | 17400     | 0.00  |
| Slice Registers       | 145  | 0     | 0          | 106400    | 0.14  |
| Register as Flip Flop | 145  | 0     | 0          | 106400    | 0.14  |
| Register as Latch     | 0    | 0     | 0          | 106400    | 0.00  |
| F7 Muxes              | 0    | 0     | 0          | 26600     | 0.00  |
| F8 Muxes              | 0    | 0     | 0          | 13300     | 0.00  |

| Site Type      | Used | Fixed | Prohibited | Available | Util% |
|----------------|------|-------|------------|-----------|-------|
| Block RAM Tile | 0    | 0     | 0          | 140       | 0.00  |
| RAMB36/FIFO*   | 0    | 0     | 0          | 140       | 0.00  |
| RAMB18         | 0    | 0     | 0          | 280       | 0.00  |

The synthesis report shows that this design does not use any BRAM.

### **Timing Report**

• Try to synthesize the design with maximum frequency

Clock Summary

| Clock    | Waveform(ns)  | Period(ns) | Frequency(MHz) |
|----------|---------------|------------|----------------|
| axis_clk | {0.000 5.000} | 12.000     | 83.333         |

After testing, the shortest clock period was set to 12 ns.

• Report timing on longest path, slack

```
Slack (MET) :             0.269ns  (required time - arrival time)
  Source:                 mul_data_in_sel_reg/C
                            (rising edge-triggered cell FDRE clocked by axis_clk  {rise@0.000ns fall@5.000ns period=12.000ns})
  Destination:            acc_reg[29]/D
                            (rising edge-triggered cell FDRE clocked by axis_clk  {rise@0.000ns fall@5.000ns period=12.000ns})
  Path Group:             axis_clk
  Path Type:              Setup (Max at Slow Process Corner)
  Requirement:            12.000ns  (axis_clk rise@12.000ns - axis_clk rise@0.000ns)
  Data Path Delay:        11.626ns  (logic 8.472ns (72.868%)  route 3.154ns (27.132%))
  Logic Levels:           10  (CARRY4=5 DSP48E1=2 LUT2=2 LUT4=1)
  Clock Path Skew:        -0.145ns (DCD - SCD + CPR)
    Destination Clock Delay (DCD):    2.128ns = ( 14.128 - 12.000 )
    Source Clock Delay      (SCD):    2.456ns
    Clock Pessimism Removal (CPR):    0.184ns
  Clock Uncertainty:      0.035ns  ((TSJ^2 + TIJ^2)^1/2 + DJ) / 2 + PE
    Total System Jitter     (TSJ):    0.071ns
    Total Input Jitter      (TIJ):    0.000ns
    Discrete Jitter          (DJ):    0.000ns
    Phase Error              (PE):    0.000ns

    Location             Delay type                Incr(ns)  Path(ns)    Netlist Resource(s)
  -------------------------------------------------------------------    -------------------
                         (clock axis_clk rise edge)
                                                      0.000     0.000 r
                                                      0.000     0.000 r  axis_clk (IN)
                         net (fo=0)                   0.000     0.000    axis_clk
                                                                      r  axis_clk_IBUF_inst/I
                         IBUF (Prop_ibuf_I_O)         0.972     0.972 r  axis_clk_IBUF_inst/O
                         net (fo=1, unplaced)         0.800     1.771    axis_clk_IBUF
                                                                      r  axis_clk_IBUF_BUFG_inst/I
                         BUFG (Prop_bufg_I_O)         0.101     1.872 r  axis_clk_IBUF_BUFG_inst/O
                         net (fo=145, unplaced)       0.584     2.456    axis_clk_IBUF_BUFG
                         FDRE                                         r  mul_data_in_sel_reg/C
  -------------------------------------------------------------------    -------------------
                         FDRE (Prop_fdre_C_Q)         0.478     2.934 f  mul_data_in_sel_reg/Q
                         net (fo=32, unplaced)        0.382     3.316    mul_data_in_sel
                                                                      f  acc2__0_i_1/I1
                         LUT2 (Prop_lut2_I1_O)        0.295     3.611 r  acc2__0_i_1/O
                         net (fo=1, unplaced)         0.800     4.411    acc3[16]
                                                                      r  acc2__0/A[16]
                         DSP48E1 (Prop_dsp48e1_A[16]_PCOUT[47])
                                                      4.036     8.447 r  acc2__0/PCOUT[47]
                         net (fo=1, unplaced)         0.055     8.502    acc2__0_n_106
                                                                      r  acc2__1/PCIN[47]
                         DSP48E1 (Prop_dsp48e1_PCIN[47]_P[0])
                                                      1.518    10.020 r  acc2__1/P[0]
                         net (fo=2, unplaced)         0.800    10.820    acc2__1_n_105
                                                                      r  sm_tdata_OBUF[19]_inst_i_13/I0
                         LUT2 (Prop_lut2_I0_O)        0.124    10.944 r  sm_tdata_OBUF[19]_inst_i_13/O
                         net (fo=1, unplaced)         0.000    10.944    sm_tdata_OBUF[19]_inst_i_13_n_0
                                                                      r  sm_tdata_OBUF[19]_inst_i_10/S[1]
                         CARRY4 (Prop_carry4_S[1]_CO[3])
                                                      0.533    11.477 r  sm_tdata_OBUF[19]_inst_i_10/CO[3]
                         net (fo=1, unplaced)         0.009    11.486    sm_tdata_OBUF[19]_inst_i_10_n_0
                                                                      r  sm_tdata_OBUF[23]_inst_i_10/CI
                         CARRY4 (Prop_carry4_CI_CO[3])
                                                      0.117    11.603 r  sm_tdata_OBUF[23]_inst_i_10/CO[3]
                         net (fo=1, unplaced)         0.000    11.603    sm_tdata_OBUF[23]_inst_i_10_n_0
                                                                      r  sm_tdata_OBUF[27]_inst_i_10/CI
                         CARRY4 (Prop_carry4_CI_O[3])
                                                      0.331    11.934 r  sm_tdata_OBUF[27]_inst_i_10/O[3]
                         net (fo=3, unplaced)         0.636    12.570    sm_tdata_OBUF[27]_inst_i_10_n_4
                                                                      r  acc[24]_i_2/I0
                         LUT4 (Prop_lut4_I0_O)        0.307    12.877 r  acc[24]_i_2/O
                         net (fo=1, unplaced)         0.473    13.350    in[27]
                                                                      r  acc_reg[24]_i_1/DI[3]
                         CARRY4 (Prop_carry4_DI[3]_CO[3])
                                                      0.396    13.746 r  acc_reg[24]_i_1/CO[3]
                         net (fo=1, unplaced)         0.000    13.746    acc_reg[24]_i_1_n_0
                                                                      r  acc_reg[28]_i_1/CI
                         CARRY4 (Prop_carry4_CI_O[1])
                                                      0.337    14.083 r  acc_reg[28]_i_1/O[1]
                         net (fo=1, unplaced)         0.000    14.083    acc_reg[28]_i_1_n_6
                         FDRE                                         r  acc_reg[29]/D
  -------------------------------------------------------------------    -------------------

                         (clock axis_clk rise edge)
                                                     12.000    12.000 r
                                                      0.000    12.000 r  axis_clk (IN)
                         net (fo=0)                   0.000    12.000    axis_clk
                                                                      r  axis_clk_IBUF_inst/I
                         IBUF (Prop_ibuf_I_O)         0.838    12.838 r  axis_clk_IBUF_inst/O
                         net (fo=1, unplaced)         0.760    13.598    axis_clk_IBUF
                                                                      r  axis_clk_IBUF_BUFG_inst/I
                         BUFG (Prop_bufg_I_O)         0.091    13.689 r  axis_clk_IBUF_BUFG_inst/O
                         net (fo=145, unplaced)       0.439    14.128    axis_clk_IBUF_BUFG
                         FDRE                                         r  acc_reg[29]/C
                         clock pessimism              0.184    14.311
                         clock uncertainty           -0.035    14.276
                         FDRE (Setup_fdre_C_D)        0.076    14.352    acc_reg[29]
  -------------------------------------------------------------------
                         required time                         14.352
                         arrival time                         -14.083
  -------------------------------------------------------------------
                         slack                                  0.269
```
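The headline numbers of the timing report can be cross-checked with a few lines of arithmetic. The sketch below uses only values quoted in the report (required time 14.352 ns, arrival time 14.083 ns, period 12.000 ns); the helper names are hypothetical. The last two lines estimate the period at which the slack would reach zero, about 11.73 ns or roughly 85 MHz. That figure is only a first-order estimate, since re-running synthesis with a tighter constraint can change the critical path.

```python
def slack_ns(required_ns, arrival_ns):
    """Setup slack = required time - arrival time, rounded to report precision."""
    return round(required_ns - arrival_ns, 3)


def freq_mhz(period_ns):
    """Clock frequency in MHz for a period given in ns."""
    return 1000.0 / period_ns


period = 12.000
slack = slack_ns(14.352, 14.083)
print("slack (ns):", slack)
print("frequency (MHz):", round(freq_mhz(period), 3))

# First-order estimate only: period that would leave zero slack on this path.
zero_slack_period = round(period - slack, 3)
print("zero-slack period (ns):", zero_slack_period)
print("estimated frequency (MHz):", round(freq_mhz(zero_slack_period), 2))
```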

#### **Simulation Waveform:**

• Coefficient program, and read back



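The waveform shows that the data length and the coefficients are first programmed into the FIR, and the coefficients are then read back. Next, ap\_start is programmed. Once the FIR starts computing, fir\_tb keeps polling ap\_done.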
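The sequence described above (program, read back, start, poll) can be sketched against a mock register file. Everything in this sketch is an assumption for illustration: the register offsets `AP_CTRL`, `DATA_LEN` and `TAP_BASE`, the bit positions, and the `MockFir` class are hypothetical and are not taken from this report or its testbench.

```python
# Hypothetical register map, for illustration only.
AP_CTRL = 0x00    # assumed: bit 0 = ap_start, bit 1 = ap_done
DATA_LEN = 0x10   # assumed offset of the data-length register
TAP_BASE = 0x20   # assumed offset of the first coefficient


class MockFir:
    """Toy device that reports done after a fixed number of status reads."""

    def __init__(self, busy_polls=3):
        self.regs = {}
        self.started = False
        self.busy = busy_polls

    def write(self, addr, value):
        if addr == AP_CTRL and value & 1:
            self.started = True           # ap_start programmed
        else:
            self.regs[addr] = value

    def read(self, addr):
        if addr != AP_CTRL:
            return self.regs.get(addr, 0)
        if self.started and self.busy == 0:
            return 0b10                   # ap_done set
        if self.started:
            self.busy -= 1                # still computing
        return 0


def run_sequence(dev, data_length, taps):
    """Program, read back and check, start, then poll until done."""
    dev.write(DATA_LEN, data_length)
    for i, t in enumerate(taps):
        dev.write(TAP_BASE + 4 * i, t)
    readback = [dev.read(TAP_BASE + 4 * i) for i in range(len(taps))]
    if readback != list(taps):
        raise ValueError("coefficient read-back mismatch")
    dev.write(AP_CTRL, 1)
    polls = 0
    while not dev.read(AP_CTRL) & 0b10:   # keep polling ap_done
        polls += 1
    return readback, polls
```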

• Data-in stream-in



The waveform shows that the FIR reads in one data word every 11 cycles.

• Data-out stream-out



The waveform shows that the FIR outputs one data word every 11 cycles.

• RAM access control



• FSM

The design does not use an FSM.

#### Simulation result

All coefficients are correct.

All outputs match the golden values. Computing 600 data items took 6610 clock cycles in total.
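As a quick consistency check, the total cycle count can be split into the steady-state part (600 outputs at 11 cycles each) and the remainder. The report does not break down the remaining cycles, so treating them as start-up or drain overhead is an assumption. The run time is computed for the 12 ns clock period used in synthesis, and the helper names are hypothetical.

```python
def cycle_breakdown(total_cycles, n_outputs, cycles_per_output):
    """Split the total into steady-state cycles and whatever is left over."""
    steady = n_outputs * cycles_per_output
    return steady, total_cycles - steady


def runtime_us(total_cycles, period_ns):
    """Run time in microseconds for a given clock period in ns."""
    return total_cycles * period_ns / 1000.0


steady, remainder = cycle_breakdown(6610, 600, 11)
print("steady-state cycles:", steady)
print("remaining cycles (assumed overhead):", remainder)
print("run time at 12 ns (us):", runtime_us(6610, 12.0))
```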